Two-sided Exact Tests and Matching Confidence Intervals for Discrete Data
Abstract
There is an inherent relationship between two-sided hypothesis tests and confidence intervals. A series of two-sided hypothesis tests may be inverted to obtain the matching 100(1-α)% confidence interval, defined as the smallest interval that contains all point null parameter values that would not be rejected at the α level. Unfortunately, for discrete data there are several different ways of defining two-sided exact tests, and the most commonly used two-sided exact tests are defined one way while the most commonly used exact confidence intervals are inversions of tests defined another way. This can lead to inconsistencies where the exact test rejects but the exact confidence interval contains the null parameter value. The packages exactci and exact2x2 provide several exact tests with the matching confidence intervals, avoiding these inconsistencies as much as possible. Examples are given for binomial and Poisson parameters and for both paired and unpaired 2×2 tables.

Applied statisticians are increasingly being encouraged to report confidence intervals (CIs) and parameter estimates along with p-values from hypothesis tests. The htest class of the stats package is ideally suited to these kinds of analyses, because all the related statistics may be presented when the results are printed. For exact two-sided tests applied to discrete data, a test-CI inconsistency may occur: the p-value may indicate a significant result at level α while the associated 100(1-α)% confidence interval covers the null value of the parameter. Ideally, we would like to present a unified report (Hirji, 2006), whereby the p-value and the confidence interval match as much as possible.
A motivating example

I was asked to help design a study to determine if adding a new drug (albendazole) to an existing treatment regimen (ivermectin) for the treatment of a parasitic disease (lymphatic filariasis) would increase the incidence of a rare serious adverse event when given in an area endemic for another parasitic disease (loa loa). There are many statistical issues related to that design (Fay et al., 2007), but here we consider a simple scenario to highlight the point of this paper. A previous mass treatment using the existing treatment had 2 out of 17877 subjects experiencing the serious adverse event (SAE), giving an observed rate of 11.2 per 100,000. Suppose the new treatment was given to 20,000 new subjects and that 10 subjects experienced the SAE, giving an observed rate of 50 per 100,000. Assuming Poisson rates, an exact test using poisson.test(c(2,10),c(17877,20000)) from the stats package (throughout we assume version 2.11.0 for the stats package) gives a p-value of p = 0.0421, implying a significant difference between the rates at the 0.05 level. However, poisson.test also gives a 95% confidence interval of (0.024, 1.050), which contains a rate ratio of 1, implying no significant difference. We return to the motivating example in the ‘Poisson two-sample’ section later.

Overview of two-sided exact tests

We briefly review inferences using the p-value function for discrete data. For details see Hirji (2006) or Blaker (2000). Suppose you have a discrete statistic t with random variable T such that larger values of T imply larger values of a parameter of interest, θ. Let Fθ(t) = Pr[T ≤ t; θ] and F̄θ(t) = Pr[T ≥ t; θ]. Suppose we are testing H0 : θ ≥ θ0 versus H1 : θ < θ0, where θ0 is known. Then smaller values of t are more likely to reject, and if we observe t, the probability of observing equal or smaller values is Fθ0(t), which is the one-sided p-value. Conversely, the one-sided p-value for testing H0 : θ ≤ θ0 is F̄θ0(t).
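As a minimal sketch of the two one-sided p-values just defined, the Python snippet below evaluates Fθ0(t) and F̄θ0(t) for a single Poisson count. The Poisson model, the function names, and the values t = 2 and θ0 = 5 are illustrative assumptions and do not come from the paper or from the stats package.

```python
from math import exp, factorial

def poisson_pmf(k, theta):
    """Pr[T = k; theta] for a Poisson count with mean theta."""
    return exp(-theta) * theta**k / factorial(k)

def lower_tail(t, theta):
    """F_theta(t) = Pr[T <= t; theta]."""
    return sum(poisson_pmf(k, theta) for k in range(t + 1))

def upper_tail(t, theta):
    """Fbar_theta(t) = Pr[T >= t; theta] = 1 - Pr[T <= t - 1; theta]."""
    return 1.0 - lower_tail(t - 1, theta)

# Hypothetical observation and null value, chosen only for illustration.
t, theta0 = 2, 5.0

p_less = lower_tail(t, theta0)     # one-sided p-value for H0: theta >= theta0
p_greater = upper_tail(t, theta0)  # one-sided p-value for H0: theta <= theta0

print(f"F(t) = {p_less:.6f}, Fbar(t) = {p_greater:.6f}")
```

Because both tails include the observed point t, the two p-values sum to more than 1 for discrete data; the excess is exactly Pr[T = t; θ0].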
We reject when the p-value is less than or equal to the significance level, α. The one-sided confidence interval would be all values of θ0 for which the p-value is greater than α. We list 3 ways to define the two-sided p-value for testing H0 : θ = θ0, which we denote pc, pm and pb for the central, minlike, and blaker methods, respectively:

central: pc is 2 times the minimum of the one-sided p-values, bounded above by 1, or mathematically, pc = min{1, 2 × min(Fθ0(t), F̄θ0(t))}. The name central is motivated by the associated inversion confidence intervals, which are central intervals, i.e., they guarantee that the lower (upper) limit of the 100(1-α)% confidence interval has less than α/2 probability of being greater (less) than the true parameter. This is called the TST (twice the smaller tail) method by Hirji (2006).

[The R Journal Vol. 2/1, June 2010, ISSN 2073-4859]

minlike: pm is the sum of probabilities of outcomes with likelihoods less than or equal to the observed likelihood, or pm = ∑_{T: f(T) ≤ f(t)} f(T), where f(t) = Pr[T = t; θ0]. This is called the PB (probability based) method by Hirji (2006).

blaker: pb combines the probability of the smaller observed tail with the largest probability of the opposite tail that does not exceed that observed tail probability. Blaker (2000) showed that this p-value may be expressed as pb = Pr[γ(T) ≤ γ(t)], where γ(T) = min{Fθ0(T), F̄θ0(T)}. The name blaker is motivated by Blaker (2000), which comprehensively studies the associated method for confidence intervals, although the method had been mentioned in the literature earlier; see e.g., Cox and Hinkley (1974), p. 79. This is called the CT (combined tail) method by Hirji (2006).

There are other ways to define two-sided p-values, such as defining extreme values according to the score statistic (see e.g., Hirji (2006, Chapter 3), or Agresti and Min (2001)). Note that pc ≥ pb for all cases, so that pb gives more powerful tests than pc.
On the other hand, although generally pc > pm, it is possible for pc < pm. If p(θ0) is a two-sided p-value testing H0 : θ = θ0, then its 100(1-α)% matching confidence interval is the smallest interval that contains all θ0 such that p(θ0) > α. To calculate the matching confidence intervals, we consider only regular cases where Fθ(t) and F̄θ(t) are monotonic functions of θ (except perhaps the degenerate cases where Fθ(t) = 1 or F̄θ(t) = 1 for all θ, when t is the maximum or minimum, respectively). In this case pc(θ0) > α exactly when both Fθ0(t) and F̄θ0(t) exceed α/2, so the matching confidence limits to the central test are (θL, θU), which are the solutions to F̄θL(t) = α/2 and FθU(t) = α/2.
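The definitions above can be sketched numerically on the motivating example. The Python snippet below is an illustration only, not the implementation used by poisson.test, exactci or exact2x2. It assumes the usual conditioning argument: given the total of 12 events, the count in the first group is Binomial(12, p), with p = 17877/(17877+20000) when the rate ratio is 1. The function names, the relative tolerance used for ties, and the bisection solver are hypothetical choices made for this sketch.

```python
from math import comb

def pmf(k, n, p):
    """f(k) = Pr[T = k] for T ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def lower_tail(t, n, p):
    """F_p(t) = Pr[T <= t]."""
    return sum(pmf(k, n, p) for k in range(t + 1))

def upper_tail(t, n, p):
    """Fbar_p(t) = Pr[T >= t]."""
    return sum(pmf(k, n, p) for k in range(t, n + 1))

REL = 1 + 1e-7  # assumed tolerance so floating-point ties count as equal

def p_central(t, n, p):
    """Twice the smaller tail, bounded above by 1."""
    return min(1.0, 2 * min(lower_tail(t, n, p), upper_tail(t, n, p)))

def p_minlike(t, n, p):
    """Sum of f(T) over outcomes no more likely than the observed one."""
    ft = pmf(t, n, p)
    return sum(pmf(k, n, p) for k in range(n + 1) if pmf(k, n, p) <= ft * REL)

def p_blaker(t, n, p):
    """Pr[gamma(T) <= gamma(t)], with gamma the smaller tail probability."""
    def gamma(k):
        return min(lower_tail(k, n, p), upper_tail(k, n, p))
    gt = gamma(t)
    return sum(pmf(k, n, p) for k in range(n + 1) if gamma(k) <= gt * REL)

def solve_increasing(g, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for g(x) = target, assuming g is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def central_ci(t, n, alpha=0.05):
    """Limits solving Fbar_pL(t) = alpha/2 and F_pU(t) = alpha/2."""
    lower = 0.0 if t == 0 else solve_increasing(
        lambda p: upper_tail(t, n, p), alpha / 2)
    # F_p(t) decreases in p, so negate it to reuse the increasing solver.
    upper = 1.0 if t == n else solve_increasing(
        lambda p: -lower_tail(t, n, p), -alpha / 2)
    return lower, upper

# Motivating example: 2 events in 17877 subjects versus 10 in 20000.
x1, x2, n1, n2 = 2, 10, 17877, 20000
total = x1 + x2
p0 = n1 / (n1 + n2)  # binomial parameter when the rate ratio equals 1

pc = p_central(x1, total, p0)
pm = p_minlike(x1, total, p0)
pb = p_blaker(x1, total, p0)

# Map binomial limits to the rate ratio: ratio = p / (1 - p) * n2 / n1.
lo, hi = central_ci(x1, total)
ci_ratio = tuple(p / (1 - p) * n2 / n1 for p in (lo, hi))

print(f"central={pc:.4f} minlike={pm:.4f} blaker={pb:.4f}")
print(f"central 95% CI for rate ratio: ({ci_ratio[0]:.3f}, {ci_ratio[1]:.3f})")
```

Under these assumptions the minlike value should agree with the p = 0.0421 reported for the motivating example, while the central p-value exceeds 0.05 and the central interval covers a rate ratio of 1. This is the test-CI inconsistency described earlier: the reported p-value and the reported interval come from two different definitions.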